Overview

Dataset statistics

Number of variables: 19
Number of observations: 2550
Missing cells: 1176
Missing cells (%): 2.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 378.6 KiB
Average record size in memory: 152.1 B
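
The missing-cell percentage can be checked from the counts above: the table has 19 × 2550 cells, of which 1176 are missing. A minimal Python sketch of that arithmetic, using only the figures reported in this overview (the variable names are illustrative, not taken from the profiler):

```python
# Figures copied from the dataset statistics above.
n_variables = 19
n_observations = 2550
missing_cells = 1176

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
```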

Variable types

Numeric: 11
Categorical: 8

Alerts

LABEL2013 has constant value "Urban" (Constant)
LABEL2014 has constant value "Urban" (Constant)
LABEL2015 has constant value "Urban" (Constant)
LABEL2016 has constant value "Urban" (Constant)
LABEL2017 has constant value "Urban" (Constant)
LABEL2018 has constant value "Urban" (Constant)
LABEL2019 has constant value "Urban" (Constant)
LABEL2020 has constant value "Urban" (Constant)
df_index is highly correlated with LAT and 4 other fields (High correlation)
LON is highly correlated with df_index and 8 other fields (High correlation)
2013 is highly correlated with df_index and 8 other fields (High correlation)
2014 is highly correlated with df_index and 9 other fields (High correlation)
2015 is highly correlated with df_index and 8 other fields (High correlation)
2016 is highly correlated with LON and 7 other fields (High correlation)
2017 is highly correlated with LON and 7 other fields (High correlation)
2018 is highly correlated with LON and 7 other fields (High correlation)
2019 is highly correlated with LON and 7 other fields (High correlation)
2020 is highly correlated with 2013 and 6 other fields (High correlation)
LABEL2020 is highly correlated with LABEL2017 and 6 other fields (High correlation)
LABEL2017 is highly correlated with LABEL2020 and 6 other fields (High correlation)
LABEL2016 is highly correlated with LABEL2020 and 6 other fields (High correlation)
LABEL2018 is highly correlated with LABEL2020 and 6 other fields (High correlation)
LABEL2019 is highly correlated with LABEL2020 and 6 other fields (High correlation)
LABEL2014 is highly correlated with LABEL2020 and 6 other fields (High correlation)
LABEL2013 is highly correlated with LABEL2020 and 6 other fields (High correlation)
LABEL2015 is highly correlated with LABEL2020 and 6 other fields (High correlation)
LAT is highly correlated with df_index and 2 other fields (High correlation)
LABEL2013 has 230 (9.0%) missing values (Missing)
LABEL2014 has 200 (7.8%) missing values (Missing)
LABEL2015 has 194 (7.6%) missing values (Missing)
LABEL2016 has 190 (7.5%) missing values (Missing)
LABEL2017 has 179 (7.0%) missing values (Missing)
LABEL2018 has 118 (4.6%) missing values (Missing)
LABEL2019 has 65 (2.5%) missing values (Missing)
df_index has unique values (Unique)
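
The per-column missing percentages in these alerts are each column's missing count divided by the 2550 observations. The sketch below recomputes them from the reported counts and sums the counts, which gives the 1176 missing cells listed in the overview. It is only a consistency check on the reported figures; the dictionary and variable names are illustrative.

```python
# Missing-value counts copied from the alerts above.
n_observations = 2550
missing_counts = {
    "LABEL2013": 230,
    "LABEL2014": 200,
    "LABEL2015": 194,
    "LABEL2016": 190,
    "LABEL2017": 179,
    "LABEL2018": 118,
    "LABEL2019": 65,
}

# Percentage of observations missing in each column, to one decimal place.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_counts.items()
}
total_missing = sum(missing_counts.values())

for name, pct in missing_pct.items():
    print(f"{name}: {missing_counts[name]} ({pct}%)")
print(f"Total missing cells: {total_missing}")
```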

Reproduction

Analysis started: 2022-09-22 15:21:57.776620
Analysis finished: 2022-09-22 15:22:14.502286
Duration: 16.73 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
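
The reported duration is the difference between the two timestamps above. A short sketch using Python's standard `datetime` module (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2022-09-22 15:21:57.776620")
finished = datetime.fromisoformat("2022-09-22 15:22:14.502286")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```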

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 2550
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12901.11804
Minimum: 107
Maximum: 25800
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 107
5-th percentile: 6984.45
Q1: 10530.25
Median: 12940.5
Q3: 15419.75
95-th percentile: 18046.1
Maximum: 25800
Range: 25693
Interquartile range (IQR): 4889.5

Descriptive statistics

Standard deviation: 3937.916179
Coefficient of variation (CV): 0.3052383651
Kurtosis: 1.662250451
Mean: 12901.11804
Median Absolute Deviation (MAD): 2462
Skewness: -0.1172750626
Sum: 32897851
Variance: 15507183.83
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
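
Several of the df_index statistics are derived from others by the usual definitions: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below recomputes them from the reported values as a consistency check. It assumes these standard definitions and is not the profiler's own implementation; the variable names are illustrative.

```python
import math

# Values copied from the df_index statistics in this report.
minimum, maximum = 107, 25800
q1, q3 = 10530.25, 15419.75
mean = 12901.11804
std = 3937.916179

value_range = maximum - minimum   # reported: 25693
iqr = q3 - q1                     # reported: 4889.5
cv = std / mean                   # reported: 0.3052383651
variance = std ** 2               # reported: 15507183.83

print(value_range, iqr)
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.2f}")
```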
Common values

Value | Count | Frequency (%)
107 | 1 | < 0.1%
14550 | 1 | < 0.1%
14543 | 1 | < 0.1%
14544 | 1 | < 0.1%
14545 | 1 | < 0.1%
14546 | 1 | < 0.1%
14547 | 1 | < 0.1%
14548 | 1 | < 0.1%
14549 | 1 | < 0.1%
14551 | 1 | < 0.1%
Other values (2540) | 2540 | 99.6%

Minimum 10 values

Value | Count | Frequency (%)
107 | 1 | < 0.1%
108 | 1 | < 0.1%
143 | 1 | < 0.1%
246 | 1 | < 0.1%
247 | 1 | < 0.1%
347 | 1 | < 0.1%
348 | 1 | < 0.1%
425 | 1 | < 0.1%
426 | 1 | < 0.1%
427 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
25800 | 1 | < 0.1%
25774 | 1 | < 0.1%
25473 | 1 | < 0.1%
25455 | 1 | < 0.1%
25453 | 1 | < 0.1%
25399 | 1 | < 0.1%
25382 | 1 | < 0.1%
25381 | 1 | < 0.1%
25380 | 1 | < 0.1%
25379 | 1 | < 0.1%

LAT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 100
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.43357255
Minimum: 17.0275
Maximum: 17.7475
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 17.0275
5-th percentile: 17.3075
Q1: 17.3775
Median: 17.4425
Q3: 17.4975
95-th percentile: 17.5425
Maximum: 17.7475
Range: 0.72
Interquartile range (IQR): 0.12

Descriptive statistics

Standard deviation: 0.08394416492
Coefficient of variation (CV): 0.004815086792
Kurtosis: 1.99202693
Mean: 17.43357255
Median Absolute Deviation (MAD): 0.06
Skewness: -0.6344688667
Sum: 44455.61
Variance: 0.007046622824
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
17.5075 | 65 | 2.5%
17.4975 | 62 | 2.4%
17.5125 | 62 | 2.4%
17.5025 | 61 | 2.4%
17.4525 | 61 | 2.4%
17.4925 | 61 | 2.4%
17.4825 | 60 | 2.4%
17.4475 | 60 | 2.4%
17.4575 | 59 | 2.3%
17.4625 | 59 | 2.3%
Other values (90) | 1940 | 76.1%

Minimum 10 values

Value | Count | Frequency (%)
17.0275 | 2 | 0.1%
17.0625 | 2 | 0.1%
17.0675 | 4 | 0.2%
17.0725 | 4 | 0.2%
17.0775 | 4 | 0.2%
17.0825 | 1 | < 0.1%
17.0875 | 1 | < 0.1%
17.0975 | 2 | 0.1%
17.1275 | 2 | 0.1%
17.1525 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
17.7475 | 1 | < 0.1%
17.7425 | 2 | 0.1%
17.6875 | 1 | < 0.1%
17.6425 | 3 | 0.1%
17.6375 | 5 | 0.2%
17.6325 | 6 | 0.2%
17.6275 | 4 | 0.2%
17.6225 | 5 | 0.2%
17.6175 | 6 | 0.2%
17.6125 | 5 | 0.2%

LON
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 139
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.465233
Minimum: 78.0475
Maximum: 78.92751
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 78.0475
5-th percentile: 78.28751
Q1: 78.3975
Median: 78.4675
Q3: 78.53751
95-th percentile: 78.6125
Maximum: 78.92751
Range: 0.88001
Interquartile range (IQR): 0.14001

Descriptive statistics

Standard deviation: 0.1193077387
Coefficient of variation (CV): 0.001520517229
Kurtosis: 2.605373412
Mean: 78.465233
Median Absolute Deviation (MAD): 0.07001
Skewness: -0.008685703397
Sum: 200086.3441
Variance: 0.01423433651
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
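
The report labels df_index as "Strictly increasing", LON as "Increasing" and LAT as "Not monotonic". One plausible reading of these labels is sketched below: compare each value with its successor and pick the strictest label that holds. The function name, the two "decreasing" labels and the sample inputs are assumptions for illustration, and the profiler's own logic may differ.

```python
def classify_monotonicity(values):
    """Return a monotonicity label for a sequence of comparable values.

    Sequences with fewer than two elements trivially satisfy every
    condition and are labelled "Strictly increasing".
    """
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"  # ties allowed
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"  # assumed label, by symmetry
    if all(a >= b for a, b in pairs):
        return "Decreasing"  # assumed label, by symmetry
    return "Not monotonic"


# Hypothetical sample inputs, not rows from the dataset.
print(classify_monotonicity([107, 108, 143]))
print(classify_monotonicity([78.0475, 78.0475, 78.05251]))
print(classify_monotonicity([17.5, 17.4, 17.6]))
```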
Common values

Value | Count | Frequency (%)
78.42751 | 53 | 2.1%
78.4325 | 53 | 2.1%
78.4875 | 52 | 2.0%
78.48251 | 50 | 2.0%
78.4625 | 50 | 2.0%
78.4225 | 49 | 1.9%
78.4775 | 49 | 1.9%
78.4175 | 48 | 1.9%
78.4375 | 48 | 1.9%
78.5175 | 48 | 1.9%
Other values (129) | 2050 | 80.4%

Minimum 10 values

Value | Count | Frequency (%)
78.0475 | 2 | 0.1%
78.05251 | 1 | < 0.1%
78.0625 | 2 | 0.1%
78.0675 | 2 | 0.1%
78.0725 | 6 | 0.2%
78.0775 | 11 | 0.4%
78.0825 | 6 | 0.2%
78.0875 | 5 | 0.2%
78.1125 | 1 | < 0.1%
78.1175 | 2 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
78.92751 | 1 | < 0.1%
78.9225 | 1 | < 0.1%
78.9025 | 1 | < 0.1%
78.8975 | 3 | 0.1%
78.8925 | 8 | 0.3%
78.8875 | 7 | 0.3%
78.8825 | 7 | 0.3%
78.8775 | 5 | 0.2%
78.87251 | 3 | 0.1%
78.8375 | 1 | < 0.1%

2013
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2414
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3287112235
Minimum: 0.16032
Maximum: 0.58204
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.16032
5-th percentile: 0.2175865
Q1: 0.27691
Median: 0.323695
Q3: 0.3755075
95-th percentile: 0.4574415
Maximum: 0.58204
Range: 0.42172
Interquartile range (IQR): 0.0985975

Descriptive statistics

Standard deviation: 0.07074401917
Coefficient of variation (CV): 0.2152163179
Kurtosis: -0.2585913859
Mean: 0.3287112235
Median Absolute Deviation (MAD): 0.049015
Skewness: 0.2833181672
Sum: 838.21362
Variance: 0.005004716248
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.28616 | 4 | 0.2%
0.27346 | 3 | 0.1%
0.41829 | 3 | 0.1%
0.33072 | 3 | 0.1%
0.29692 | 3 | 0.1%
0.22639 | 2 | 0.1%
0.29234 | 2 | 0.1%
0.21337 | 2 | 0.1%
0.39562 | 2 | 0.1%
0.32785 | 2 | 0.1%
Other values (2404) | 2524 | 99.0%

Minimum 10 values

Value | Count | Frequency (%)
0.16032 | 1 | < 0.1%
0.16302 | 1 | < 0.1%
0.16714 | 1 | < 0.1%
0.1673 | 1 | < 0.1%
0.16744 | 1 | < 0.1%
0.17128 | 1 | < 0.1%
0.17191 | 1 | < 0.1%
0.17381 | 1 | < 0.1%
0.17499 | 1 | < 0.1%
0.17686 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.58204 | 1 | < 0.1%
0.56936 | 1 | < 0.1%
0.54841 | 1 | < 0.1%
0.53817 | 1 | < 0.1%
0.53354 | 1 | < 0.1%
0.52991 | 1 | < 0.1%
0.52731 | 1 | < 0.1%
0.52646 | 1 | < 0.1%
0.52012 | 1 | < 0.1%
0.51869 | 1 | < 0.1%

2014
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2431
Distinct (%): 95.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.323322102
Minimum: 0.16165
Maximum: 0.58526
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.16165
5-th percentile: 0.2175325
Q1: 0.272355
Median: 0.318215
Q3: 0.3669125
95-th percentile: 0.448162
Maximum: 0.58526
Range: 0.42361
Interquartile range (IQR): 0.0945575

Descriptive statistics

Standard deviation: 0.06898725546
Coefficient of variation (CV): 0.2133700574
Kurtosis: -0.07208338358
Mean: 0.323322102
Median Absolute Deviation (MAD): 0.047255
Skewness: 0.395791906
Sum: 824.47136
Variance: 0.004759241416
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.38254 | 4 | 0.2%
0.27308 | 3 | 0.1%
0.2746 | 3 | 0.1%
0.27109 | 3 | 0.1%
0.33969 | 3 | 0.1%
0.31084 | 2 | 0.1%
0.29409 | 2 | 0.1%
0.34752 | 2 | 0.1%
0.31754 | 2 | 0.1%
0.33371 | 2 | 0.1%
Other values (2421) | 2524 | 99.0%

Minimum 10 values

Value | Count | Frequency (%)
0.16165 | 1 | < 0.1%
0.16378 | 1 | < 0.1%
0.16593 | 1 | < 0.1%
0.16685 | 1 | < 0.1%
0.16855 | 1 | < 0.1%
0.16961 | 1 | < 0.1%
0.17039 | 1 | < 0.1%
0.17096 | 1 | < 0.1%
0.17518 | 1 | < 0.1%
0.17618 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.58526 | 1 | < 0.1%
0.58046 | 1 | < 0.1%
0.54243 | 1 | < 0.1%
0.52757 | 1 | < 0.1%
0.5244 | 1 | < 0.1%
0.52159 | 1 | < 0.1%
0.52057 | 1 | < 0.1%
0.51903 | 1 | < 0.1%
0.51898 | 1 | < 0.1%
0.51754 | 1 | < 0.1%

2015
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2408
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3209652549
Minimum: 0.16686
Maximum: 0.5692
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.16686
5-th percentile: 0.2187695
Q1: 0.2730075
Median: 0.316755
Q3: 0.3651425
95-th percentile: 0.439987
Maximum: 0.5692
Range: 0.40234
Interquartile range (IQR): 0.092135

Descriptive statistics

Standard deviation: 0.06598958445
Coefficient of variation (CV): 0.2055972833
Kurtosis: -0.173715908
Mean: 0.3209652549
Median Absolute Deviation (MAD): 0.04584
Skewness: 0.3201133772
Sum: 818.4614
Variance: 0.004354625255
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.3164 | 3 | 0.1%
0.40373 | 3 | 0.1%
0.27054 | 3 | 0.1%
0.36793 | 3 | 0.1%
0.27798 | 3 | 0.1%
0.35868 | 3 | 0.1%
0.30401 | 3 | 0.1%
0.40993 | 3 | 0.1%
0.33349 | 3 | 0.1%
0.36318 | 2 | 0.1%
Other values (2398) | 2521 | 98.9%

Minimum 10 values

Value | Count | Frequency (%)
0.16686 | 1 | < 0.1%
0.16788 | 1 | < 0.1%
0.16921 | 1 | < 0.1%
0.17173 | 1 | < 0.1%
0.17237 | 1 | < 0.1%
0.17397 | 1 | < 0.1%
0.17455 | 1 | < 0.1%
0.17611 | 1 | < 0.1%
0.17692 | 1 | < 0.1%
0.17768 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.5692 | 1 | < 0.1%
0.55331 | 1 | < 0.1%
0.53364 | 1 | < 0.1%
0.52682 | 1 | < 0.1%
0.52634 | 1 | < 0.1%
0.51979 | 1 | < 0.1%
0.51608 | 1 | < 0.1%
0.50335 | 1 | < 0.1%
0.50183 | 1 | < 0.1%
0.50118 | 1 | < 0.1%

2016
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2415
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.306381
Minimum: 0.15874
Maximum: 0.54005
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.15874
5-th percentile: 0.209906
Q1: 0.2605125
Median: 0.300795
Q3: 0.3490625
95-th percentile: 0.421138
Maximum: 0.54005
Range: 0.38131
Interquartile range (IQR): 0.08855

Descriptive statistics

Standard deviation: 0.06319287678
Coefficient of variation (CV): 0.2062558605
Kurtosis: -0.2428108851
Mean: 0.306381
Median Absolute Deviation (MAD): 0.04434
Skewness: 0.3544407113
Sum: 781.27155
Variance: 0.003993339676
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.34755 | 3 | 0.1%
0.34984 | 3 | 0.1%
0.35404 | 3 | 0.1%
0.29107 | 3 | 0.1%
0.27848 | 3 | 0.1%
0.28084 | 3 | 0.1%
0.32901 | 3 | 0.1%
0.2809 | 2 | 0.1%
0.34212 | 2 | 0.1%
0.38992 | 2 | 0.1%
Other values (2405) | 2523 | 98.9%

Minimum 10 values

Value | Count | Frequency (%)
0.15874 | 1 | < 0.1%
0.16007 | 1 | < 0.1%
0.16173 | 1 | < 0.1%
0.16235 | 1 | < 0.1%
0.16336 | 1 | < 0.1%
0.16437 | 1 | < 0.1%
0.16604 | 1 | < 0.1%
0.16619 | 1 | < 0.1%
0.16786 | 1 | < 0.1%
0.17231 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.54005 | 1 | < 0.1%
0.53122 | 1 | < 0.1%
0.52233 | 1 | < 0.1%
0.50076 | 1 | < 0.1%
0.50047 | 1 | < 0.1%
0.49507 | 1 | < 0.1%
0.48752 | 1 | < 0.1%
0.48245 | 1 | < 0.1%
0.4806 | 1 | < 0.1%
0.47457 | 1 | < 0.1%

2017
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2429
Distinct (%): 95.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3234497882
Minimum: 0.16212
Maximum: 0.58771
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.16212
5-th percentile: 0.2167565
Q1: 0.2707375
Median: 0.317835
Q3: 0.37045
95-th percentile: 0.4543505
Maximum: 0.58771
Range: 0.42559
Interquartile range (IQR): 0.0997125

Descriptive statistics

Standard deviation: 0.07130246953
Coefficient of variation (CV): 0.2204437045
Kurtosis: -0.09581256326
Mean: 0.3234497882
Median Absolute Deviation (MAD): 0.049665
Skewness: 0.4227081944
Sum: 824.79696
Variance: 0.005084042161
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.29951 | 2 | 0.1%
0.26605 | 2 | 0.1%
0.24887 | 2 | 0.1%
0.36301 | 2 | 0.1%
0.39209 | 2 | 0.1%
0.25568 | 2 | 0.1%
0.35837 | 2 | 0.1%
0.28268 | 2 | 0.1%
0.347 | 2 | 0.1%
0.26277 | 2 | 0.1%
Other values (2419) | 2530 | 99.2%

Minimum 10 values

Value | Count | Frequency (%)
0.16212 | 1 | < 0.1%
0.16384 | 1 | < 0.1%
0.1645 | 1 | < 0.1%
0.16538 | 1 | < 0.1%
0.16564 | 1 | < 0.1%
0.16904 | 1 | < 0.1%
0.1711 | 1 | < 0.1%
0.1725 | 1 | < 0.1%
0.17286 | 1 | < 0.1%
0.17436 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.58771 | 1 | < 0.1%
0.56642 | 1 | < 0.1%
0.56352 | 1 | < 0.1%
0.56123 | 1 | < 0.1%
0.55649 | 1 | < 0.1%
0.54909 | 1 | < 0.1%
0.53513 | 1 | < 0.1%
0.53177 | 1 | < 0.1%
0.52764 | 2 | 0.1%
0.52692 | 1 | < 0.1%

2018
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2422
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3117158863
Minimum: 0.161
Maximum: 0.56912
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.161
5-th percentile: 0.213819
Q1: 0.263945
Median: 0.30481
Q3: 0.354005
95-th percentile: 0.4345975
Maximum: 0.56912
Range: 0.40812
Interquartile range (IQR): 0.09006

Descriptive statistics

Standard deviation: 0.06563081211
Coefficient of variation (CV): 0.2105468954
Kurtosis: -0.009656848504
Mean: 0.3117158863
Median Absolute Deviation (MAD): 0.04426
Skewness: 0.4847883394
Sum: 794.87551
Variance: 0.004307403498
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.33194 | 3 | 0.1%
0.30637 | 3 | 0.1%
0.25603 | 3 | 0.1%
0.26413 | 2 | 0.1%
0.29749 | 2 | 0.1%
0.28603 | 2 | 0.1%
0.31095 | 2 | 0.1%
0.26284 | 2 | 0.1%
0.29327 | 2 | 0.1%
0.29245 | 2 | 0.1%
Other values (2412) | 2527 | 99.1%

Minimum 10 values

Value | Count | Frequency (%)
0.161 | 1 | < 0.1%
0.16128 | 1 | < 0.1%
0.16357 | 1 | < 0.1%
0.16476 | 1 | < 0.1%
0.16607 | 1 | < 0.1%
0.16652 | 1 | < 0.1%
0.16965 | 1 | < 0.1%
0.17032 | 1 | < 0.1%
0.17073 | 1 | < 0.1%
0.17109 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.56912 | 1 | < 0.1%
0.55591 | 1 | < 0.1%
0.54069 | 1 | < 0.1%
0.5166 | 1 | < 0.1%
0.51463 | 1 | < 0.1%
0.50617 | 1 | < 0.1%
0.50608 | 1 | < 0.1%
0.50382 | 1 | < 0.1%
0.50322 | 1 | < 0.1%
0.50268 | 1 | < 0.1%

2019
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2404
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3095043961
Minimum: 0.16108
Maximum: 0.55003
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.16108
5-th percentile: 0.2153725
Q1: 0.2632075
Median: 0.301755
Q3: 0.3505525
95-th percentile: 0.43216
Maximum: 0.55003
Range: 0.38895
Interquartile range (IQR): 0.087345

Descriptive statistics

Standard deviation: 0.06509410737
Coefficient of variation (CV): 0.2103172304
Kurtosis: 0.1722441244
Mean: 0.3095043961
Median Absolute Deviation (MAD): 0.042895
Skewness: 0.5751257865
Sum: 789.23621
Variance: 0.004237242814
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.29154 | 3 | 0.1%
0.27034 | 3 | 0.1%
0.32341 | 3 | 0.1%
0.34081 | 3 | 0.1%
0.32226 | 3 | 0.1%
0.27142 | 3 | 0.1%
0.26464 | 3 | 0.1%
0.32272 | 3 | 0.1%
0.33507 | 3 | 0.1%
0.40432 | 3 | 0.1%
Other values (2394) | 2520 | 98.8%

Minimum 10 values

Value | Count | Frequency (%)
0.16108 | 1 | < 0.1%
0.16312 | 1 | < 0.1%
0.16329 | 1 | < 0.1%
0.16448 | 1 | < 0.1%
0.16602 | 1 | < 0.1%
0.16701 | 1 | < 0.1%
0.16941 | 1 | < 0.1%
0.16964 | 1 | < 0.1%
0.16981 | 1 | < 0.1%
0.16988 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.55003 | 1 | < 0.1%
0.53706 | 1 | < 0.1%
0.53432 | 1 | < 0.1%
0.53238 | 1 | < 0.1%
0.52987 | 1 | < 0.1%
0.5215 | 1 | < 0.1%
0.51546 | 1 | < 0.1%
0.51525 | 1 | < 0.1%
0.51509 | 1 | < 0.1%
0.51507 | 1 | < 0.1%

2020
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2401
Distinct (%): 94.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3386187373
Minimum: 0.17588
Maximum: 0.58955
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.0 KiB

Quantile statistics

Minimum: 0.17588
5-th percentile: 0.2370135
Q1: 0.28543
Median: 0.32747
Q3: 0.3826475
95-th percentile: 0.475871
Maximum: 0.58955
Range: 0.41367
Interquartile range (IQR): 0.0972175

Descriptive statistics

Standard deviation: 0.07226835371
Coefficient of variation (CV): 0.2134210124
Kurtosis: -0.01108442795
Mean: 0.3386187373
Median Absolute Deviation (MAD): 0.04702
Skewness: 0.5884365333
Sum: 863.47778
Variance: 0.005222714948
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.30222 | 3 | 0.1%
0.28577 | 3 | 0.1%
0.30689 | 3 | 0.1%
0.28546 | 3 | 0.1%
0.30962 | 3 | 0.1%
0.30923 | 3 | 0.1%
0.31877 | 2 | 0.1%
0.36627 | 2 | 0.1%
0.45222 | 2 | 0.1%
0.36681 | 2 | 0.1%
Other values (2391) | 2524 | 99.0%

Minimum 10 values

Value | Count | Frequency (%)
0.17588 | 1 | < 0.1%
0.17939 | 1 | < 0.1%
0.17948 | 1 | < 0.1%
0.18124 | 1 | < 0.1%
0.18263 | 1 | < 0.1%
0.1857 | 1 | < 0.1%
0.1861 | 1 | < 0.1%
0.18618 | 1 | < 0.1%
0.18716 | 1 | < 0.1%
0.18736 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.58955 | 1 | < 0.1%
0.57205 | 1 | < 0.1%
0.5684 | 1 | < 0.1%
0.56286 | 1 | < 0.1%
0.56236 | 1 | < 0.1%
0.56224 | 1 | < 0.1%
0.56012 | 1 | < 0.1%
0.55977 | 1 | < 0.1%
0.55794 | 1 | < 0.1%
0.55218 | 1 | < 0.1%

LABEL2013
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 230
Missing (%): 9.0%
Memory size: 20.0 KiB
Urban: 2320

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 11600
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Urban
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 2320 | 91.0%
(Missing) | 230 | 9.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
urban | 2320 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 2320 | 20.0%
r | 2320 | 20.0%
b | 2320 | 20.0%
a | 2320 | 20.0%
n | 2320 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9280 | 80.0%
Uppercase Letter | 2320 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2320 | 25.0%
b | 2320 | 25.0%
a | 2320 | 25.0%
n | 2320 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 2320 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11600 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 2320 | 20.0%
r | 2320 | 20.0%
b | 2320 | 20.0%
a | 2320 | 20.0%
n | 2320 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11600 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 2320 | 20.0%
r | 2320 | 20.0%
b | 2320 | 20.0%
a | 2320 | 20.0%
n | 2320 | 20.0%
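
The character-level figures for LABEL2013 follow directly from its 2320 non-missing "Urban" entries: five characters each, one uppercase and four lowercase. The sketch below reproduces the character and category counts with Python's standard library. It assumes the category counts correspond to Unicode general categories ("Lu" for uppercase letters, "Ll" for lowercase letters); the profiler's own implementation may differ, and the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# 2320 non-missing LABEL2013 entries, all "Urban", as reported above.
values = ["Urban"] * 2320

# Count every character across all entries.
char_counts = Counter(ch for value in values for ch in value)
total_chars = sum(char_counts.values())

# Aggregate the character counts by Unicode general category.
category_counts = Counter()
for ch, count in char_counts.items():
    category_counts[unicodedata.category(ch)] += count

print(f"Total characters: {total_chars}")
print(f"Distinct characters: {len(char_counts)}")
for category, count in category_counts.most_common():
    print(f"{category}: {count} ({100 * count / total_chars:.1f}%)")
```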

LABEL2014
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 200
Missing (%): 7.8%
Memory size: 20.0 KiB
Urban: 2350

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 11750
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Urban
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 2350 | 92.2%
(Missing) | 200 | 7.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
urban | 2350 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 2350 | 20.0%
r | 2350 | 20.0%
b | 2350 | 20.0%
a | 2350 | 20.0%
n | 2350 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9400 | 80.0%
Uppercase Letter | 2350 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2350 | 25.0%
b | 2350 | 25.0%
a | 2350 | 25.0%
n | 2350 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 2350 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11750 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 2350 | 20.0%
r | 2350 | 20.0%
b | 2350 | 20.0%
a | 2350 | 20.0%
n | 2350 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11750 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 2350 | 20.0%
r | 2350 | 20.0%
b | 2350 | 20.0%
a | 2350 | 20.0%
n | 2350 | 20.0%

LABEL2015
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 194
Missing (%): 7.6%
Memory size: 20.0 KiB
Urban: 2356

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 11780
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Urban
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 2356 | 92.4%
(Missing) | 194 | 7.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
urban | 2356 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 2356 | 20.0%
r | 2356 | 20.0%
b | 2356 | 20.0%
a | 2356 | 20.0%
n | 2356 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9424 | 80.0%
Uppercase Letter | 2356 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2356 | 25.0%
b | 2356 | 25.0%
a | 2356 | 25.0%
n | 2356 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 2356 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11780 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 2356 | 20.0%
r | 2356 | 20.0%
b | 2356 | 20.0%
a | 2356 | 20.0%
n | 2356 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11780 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 2356 | 20.0%
r | 2356 | 20.0%
b | 2356 | 20.0%
a | 2356 | 20.0%
n | 2356 | 20.0%

LABEL2016
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 190
Missing (%): 7.5%
Memory size: 20.0 KiB
Urban: 2360

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 11800
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Urban
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 2360 | 92.5%
(Missing) | 190 | 7.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
urban | 2360 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 2360 | 20.0%
r | 2360 | 20.0%
b | 2360 | 20.0%
a | 2360 | 20.0%
n | 2360 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9440 | 80.0%
Uppercase Letter | 2360 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2360 | 25.0%
b | 2360 | 25.0%
a | 2360 | 25.0%
n | 2360 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 2360 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11800 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 2360 | 20.0%
r | 2360 | 20.0%
b | 2360 | 20.0%
a | 2360 | 20.0%
n | 2360 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11800 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 2360 | 20.0%
r | 2360 | 20.0%
b | 2360 | 20.0%
a | 2360 | 20.0%
n | 2360 | 20.0%

LABEL2017
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 179
Missing (%): 7.0%
Memory size: 20.0 KiB
Urban: 2371

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 11855
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Urban
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 2371 | 93.0%
(Missing) | 179 | 7.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
urban | 2371 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 2371 | 20.0%
r | 2371 | 20.0%
b | 2371 | 20.0%
a | 2371 | 20.0%
n | 2371 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9484 | 80.0%
Uppercase Letter | 2371 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2371 | 25.0%
b | 2371 | 25.0%
a | 2371 | 25.0%
n | 2371 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 2371 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11855 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 2371 | 20.0%
r | 2371 | 20.0%
b | 2371 | 20.0%
a | 2371 | 20.0%
n | 2371 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11855 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 2371 | 20.0%
r | 2371 | 20.0%
b | 2371 | 20.0%
a | 2371 | 20.0%
n | 2371 | 20.0%

LABEL2018
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 118
Missing (%): 4.6%
Memory size: 20.0 KiB
Urban: 2432

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 12160
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Urban
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 2432 | 95.4%
(Missing) | 118 | 4.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
urban | 2432 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 2432 | 20.0%
r | 2432 | 20.0%
b | 2432 | 20.0%
a | 2432 | 20.0%
n | 2432 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9728 | 80.0%
Uppercase Letter | 2432 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2432 | 25.0%
b | 2432 | 25.0%
a | 2432 | 25.0%
n | 2432 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 2432 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 12160 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 2432 | 20.0%
r | 2432 | 20.0%
b | 2432 | 20.0%
a | 2432 | 20.0%
n | 2432 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12160 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 2432 | 20.0%
r | 2432 | 20.0%
b | 2432 | 20.0%
a | 2432 | 20.0%
n | 2432 | 20.0%

LABEL2019
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 65
Missing (%): 2.5%
Memory size: 20.0 KiB
Urban: 2485

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 12425
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Urban
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 2485 | 97.5%
(Missing) | 65 | 2.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
urban | 2485 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 2485 | 20.0%
r | 2485 | 20.0%
b | 2485 | 20.0%
a | 2485 | 20.0%
n | 2485 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9940 | 80.0%
Uppercase Letter | 2485 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2485 | 25.0%
b | 2485 | 25.0%
a | 2485 | 25.0%
n | 2485 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 2485 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 12425 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 2485 | 20.0%
r | 2485 | 20.0%
b | 2485 | 20.0%
a | 2485 | 20.0%
n | 2485 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12425 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 2485 | 20.0%
r | 2485 | 20.0%
b | 2485 | 20.0%
a | 2485 | 20.0%
n | 2485 | 20.0%

LABEL2020
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size20.0 KiB
Urban
2550 

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters12750
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUrban
2nd rowUrban
3rd rowUrban
4th rowUrban
5th rowUrban

Common Values

ValueCountFrequency (%)
Urban2550
100.0%

Length

2022-09-22T20:52:18.777535image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
Histogram of lengths of the category

Category Frequency Plot

2022-09-22T20:52:18.864399image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
ValueCountFrequency (%)
urban2550
100.0%

Most occurring characters

ValueCountFrequency (%)
U2550
20.0%
r2550
20.0%
b2550
20.0%
a2550
20.0%
n2550
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter10200
80.0%
Uppercase Letter2550
 
20.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
r      2550   25.0%
b      2550   25.0%
a      2550   25.0%
n      2550   25.0%

Uppercase Letter
Value  Count  Frequency (%)
U      2550   100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  12750  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
U      2550   20.0%
r      2550   20.0%
b      2550   20.0%
a      2550   20.0%
n      2550   20.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  12750  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
U      2550   20.0%
r      2550   20.0%
b      2550   20.0%
a      2550   20.0%
n      2550   20.0%

Interactions

[Figure: 11 × 11 grid of pairwise interaction plots for the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
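
The calculation described above can be sketched in plain Python as follows. This is an illustrative sketch, not the report's implementation: the helper names `average_ranks`, `pearson` and `spearman_rho` are hypothetical, and assigning tied values the mean of their rank positions is an assumption (it is a common convention).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y), pearson(x, y))
```

For this made-up data the ranks of x and y coincide, so ρ reaches its maximum while Pearson's r stays below 1.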

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exact linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
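
A minimal sketch of this formula, together with a check of the invariance property, is given below. The function name `pearson` is hypothetical, and the input values are the 2013 and 2014 columns of the first five sample rows shown at the end of this report, used purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and of both standard deviations cancel.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# First five sample rows of the 2013 and 2014 columns.
x = [0.46807, 0.43132, 0.46602, 0.38557, 0.40016]
y = [0.50665, 0.45377, 0.50911, 0.41282, 0.40265]

r = pearson(x, y)
# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson([2 * a + 3 for a in x], [0.5 * b - 1 for b in y])
# Flipping the sign of one variable flips the sign of r.
r_flipped = pearson(x, [-b for b in y])
print(r, r_rescaled, r_flipped)
```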

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
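
The pair-counting definition above corresponds to the variant usually called tau-a, and can be sketched as follows. The function name `kendall_tau_a` is hypothetical, and tied pairs are simply counted as neither concordant nor discordant. Statistical libraries often apply a tie correction (for example tau-b), so their results can differ from this sketch when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Made-up data: one swapped pair out of six gives (5 - 1) / 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```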

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
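
A sketch of the bias-corrected estimator, following the correction terms published by Bergsma (2013), is shown below. The function name `cramers_v_corrected` is hypothetical, and returning NaN when either variable is constant is an assumption of this sketch rather than documented behaviour of the report. That case matters here because every LABEL column is constant.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of categories."""
    n = len(x)
    joint = Counter(zip(x, y))
    row, col = Counter(x), Counter(y)
    r, k = len(row), len(col)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi^2 and for the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # Assumption: V is treated as undefined when a variable is constant.
        return float("nan")
    return sqrt(phi2_corr / denom)


# Made-up data for the two extremes, plus the constant case seen in this report.
perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
constant = cramers_v_corrected(["Urban"] * 10, ["Urban"] * 10)
print(perfect, independent, constant)
```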

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
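
One way to read nullity correlation is as the Pearson correlation between per-column presence indicators. The sketch below assumes that interpretation; the function name `nullity_correlation` is hypothetical, and returning `None` for a column whose presence never varies is a choice made for this example. The inputs are the label values of the last ten sample rows shown at the end of this report.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators (1.0 = present, 0.0 = missing)."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sqrt(sum((p - ma) ** 2 for p in a))
    sb = sqrt(sum((q - mb) ** 2 for q in b))
    if sa == 0 or sb == 0:
        # A column that is always present (or always missing) has no
        # nullity variation, so the correlation is undefined.
        return None
    return cov / (sa * sb)


# Last ten sample rows: LABEL2013 and LABEL2014 are missing in the same two rows,
# while LABEL2020 is never missing.
label2013 = ["Urban"] * 7 + [None, "Urban", None]
label2014 = ["Urban"] * 7 + [None, "Urban", None]
label2020 = ["Urban"] * 10

same_pattern = nullity_correlation(label2013, label2014)
always_present = nullity_correlation(label2013, label2020)
print(same_pattern, always_present)
```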
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

(index)  df_index  LAT      LON       2013     2014     2015     2016     2017     2018     2019     2020     LABEL2013  LABEL2014  LABEL2015  LABEL2016  LABEL2017  LABEL2018  LABEL2019  LABEL2020
0        107       17.3775  78.04750  0.46807  0.50665  0.46639  0.45427  0.47699  0.45679  0.42616  0.50689  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
1        108       17.3825  78.04750  0.43132  0.45377  0.43575  0.41307  0.43716  0.40492  0.36429  0.46852  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2        143       17.3775  78.05251  0.46602  0.50911  0.46874  0.45344  0.48692  0.45935  0.42363  0.50702  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
3        246       17.3325  78.06250  0.38557  0.41282  0.37643  0.38042  0.41350  0.36358  0.37119  0.42814  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
4        247       17.3375  78.06250  0.40016  0.40265  0.38272  0.38156  0.42316  0.35924  0.38613  0.43244  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
5        347       17.6225  78.06750  0.49796  0.51019  0.49874  0.43338  0.48666  0.48804  0.44542  0.48810  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
6        348       17.6275  78.06750  0.51661  0.51754  0.50118  0.44012  0.49441  0.49761  0.44885  0.47394  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
7        425       17.6075  78.07250  0.38281  0.39561  0.37005  0.33364  0.37499  0.37456  0.36513  0.38423  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
8        426       17.6125  78.07250  0.41392  0.43616  0.41272  0.36366  0.39584  0.38924  0.37742  0.40263  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
9        427       17.6175  78.07250  0.46341  0.48440  0.47281  0.42763  0.45675  0.44923  0.43628  0.46816  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban

Last rows

(index)  df_index  LAT      LON       2013     2014     2015     2016     2017     2018     2019     2020     LABEL2013  LABEL2014  LABEL2015  LABEL2016  LABEL2017  LABEL2018  LABEL2019  LABEL2020
2540     25379     17.5225  78.89250  0.33712  0.34544  0.32894  0.31398  0.34476  0.33625  0.32800  0.35803  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2541     25380     17.5275  78.89250  0.39917  0.41380  0.37952  0.38031  0.40378  0.39219  0.37231  0.43909  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2542     25381     17.5325  78.89250  0.40976  0.42836  0.39547  0.38901  0.42572  0.40216  0.38748  0.49498  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2543     25382     17.5375  78.89250  0.39758  0.41259  0.38598  0.36814  0.41094  0.39154  0.38673  0.48909  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2544     25399     17.2525  78.89750  0.33924  0.31975  0.29171  0.28282  0.31751  0.29269  0.29810  0.33698  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2545     25453     17.5225  78.89750  0.37148  0.37623  0.35456  0.33444  0.37231  0.35786  0.35958  0.38779  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2546     25455     17.5325  78.89750  0.40606  0.42729  0.38914  0.38707  0.41392  0.38866  0.39421  0.47712  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2547     25473     17.2525  78.90250  0.36136  0.33457  0.30854  0.30166  0.35136  0.31615  0.30267  0.35836  None       None       None       None       None       None       None       Urban
2548     25774     17.4625  78.92250  0.41424  0.40772  0.38819  0.36489  0.43117  0.42993  0.40344  0.48538  Urban      Urban      Urban      Urban      Urban      Urban      Urban      Urban
2549     25800     17.2325  78.92751  0.39387  0.37387  0.36644  0.39412  0.44541  0.41260  0.33360  0.40543  None       None       None       None       None       None       None       Urban